Abstract: 


Bradford’s law of bibliographic scattering is a fundamental law in bibliometrics and can 
provide valuable guidance to academic libraries in literature search and procurement. However, 
Bradford’s curves can take various shapes at different time points and a causal explanation for 
this is still lacking, so the prediction of their shape is still an open question. This paper attributes 
the deviation of the Bradford curve from the theoretical J-shape to the integer constraints on the 
journal number and article number, and extends Leimkuhler and Egghe’s formula to cover the core 
region of very productive journals, where the theoretical journal number falls below one, 
$f_t(X_r) = C/X_r^{\alpha} < 1$. The key parameters of the extended formula are identified and studied 
by using the Simon-Yule model. The reasons for the Groos Droop are explained and the critical point 
for the shape change is studied. Finally, the proposed formulae are validated with empirical data 
found in the literature. It is found that the proposed method can be used to predict the evolution of 
Bradford’s curves and thus guide academic libraries in scientific literature procurement and 
utilization. 
1 Introduction 

1.1 Introduction 


As one of the three fundamental laws of bibliometrics, the Bradford’s law of bibliographic 
scattering has many potential applications in academic libraries. For example, it can be used to 
determine the core journals or publishers of a certain research area and thus provides guidance to 
the librarians for the procurement of journals and books (Barrantes, Dalton et al. 2023). Besides, it 
can also be used to quickly locate the key WoS research areas or IPC classes for a topic and thus 
provides assistance for the readers/librarians in the literature search (Sheikh, Zahra et al. 2022). 
However, the preparation of the Bradford’s curves takes time, especially for the trivial many journals 
with only one or two relevant papers. Worse still, the scientific literature of a certain discipline or 
research area usually increases exponentially or undergoes different developmental stages (Larivière, 
Archambault et al. 2008), so strictly speaking the Bradford’s curve prepared at any time point cannot 
be used directly many years later without adjustment. There are some mathematical formulae that 
can help predict the shape of Bradford’s curve, but they usually result in a J-shaped curve while in 
practice, the Bradford’s curve can take at least six different shapes, with the most notable being the 
S-shaped curve with the so-called Groos Droop (Groos 1967). It has been suggested that this 
bibliometric law lacks both a causal explanation and comprehensive empirical examples 
(Wagner-Döbler 1997). Therefore, the prediction of the evolution of Bradford’s curve is still an open 
question and merits further investigation. 


This paper attributes the different shapes of Bradford’s curve to the integer constraints 
on the journal number and paper number. If the journal productivity $n$ becomes so high that the 
corresponding theoretical journal number $f_t(n) = C/n^{\alpha}$ falls below one, then the actual journal 
number can only be zero or one. As a result, the discrete nature of the journal number 
causes the core zone to deviate from the theoretical results of Lotka’s law or the Simon-Yule model. To 
remedy this problem, this paper proposes two different formulae for the core zone and the 
trivial-many zone respectively, and the key parameters of the formulae are identified and studied 
through theoretical analysis and Monte Carlo simulation of the Simon-Yule model. The reasons for the 
Groos Droop are explained and the critical point for the shape change is studied. Finally, the proposed 
formulae are validated with empirical data found in the literature. It is found that the proposed 
method can be used to predict the evolution of Bradford’s curves and thus guide academic libraries 
in scientific literature procurement and utilization. 
1.2 Literature Review 


Bradford’s law was first proposed by Bradford in 1934 (Bradford 1934) but did not receive 
wide recognition until Vickery further developed the theory in 1948 (Vickery 1948). According to 
Bradford’s law, if we arrange journals in descending order of productivity and divide them 
into $p$ groups with the same number of papers, then the numbers of journals $n_i$ in the groups 
follow $n_1 : n_2 : \cdots : n_p = 1 : k : \cdots : k^{p-1}$, where $k$ is a constant referred to as the 
Bradford multiplier. In addition to this verbal form, Bradford’s law can also be shown 
as a J-shaped Bradford curve by plotting the accumulated productivity $R(r)$ of the first $r$ journals 
against the natural log of the journal rank $r$. The mathematical formula of Bradford’s curve was 
proposed by Leimkuhler in 1967 (Leimkuhler 1967) and the method for determining the parameters 
of this formula was published by Egghe in 1990 (Egghe 1990). In Egghe’s formula, 
$R(r) = a\log(1 + br)$, where the key parameters $a$ and $b$ can be calculated from the article number 
$A$, the journal number $T$, and the productivity $y_m$ of the most productive journal. Although Egghe’s 
formula matches many bibliographies well, it corresponds to a J-shaped Bradford’s curve, 
which inevitably deviates from bibliographies with a Groos Droop (Egghe 1990). 
An incomplete bibliography was first believed to be the reason for the Groos Droop, but further research 
refuted this hypothesis (Qiu and Tague 1990). Egghe proved that if the rank $r$ of each journal 
is transformed into $r' = r + r_0$ by adding a large constant $r_0 > 1/b$, then the new 
curve is concave downward and thus shows a Groos Droop (Egghe and Rousseau 1988). The 
merging of different bibliographies (each with a different maximum journal productivity $y_m$) could 
be one possible reason for the large constant $r_0$ (Egghe and Rousseau 1988), but it is also likely 
that the large core regions (regions of the most productive journals with theoretical journal number 
$f_t(n) < 1$) of some bibliographies result in the large $r_0$ (Chen and Leimkuhler 1987). Essentially, 
the $y_m$ in Egghe’s formula denotes the journal productivity that satisfies $f_t(y_m) = C/y_m^{\alpha} \approx 1$ 
(Egghe 1985), rather than the maximum yield $X_1$ of a journal as claimed by Egghe himself. 
Therefore, there might be a core region of the significant few journals, each with a theoretical journal 
number $f_t(X_i) = C/X_i^{\alpha} < 1$ $(i = 1, 2, \cdots, T_0)$, and if the total number of these journals $T_0$ 
exceeds the critical value $r_0 = 1/b$, then a Groos Droop will emerge. In this paper, the latter 
explanation is adopted and Egghe’s formula is extended accordingly to predict the evolution of 
Bradford’s curves. 


In the 1990s, research interest in Bradford’s law gradually shifted from the static 
presentation of data at a particular time point toward the time-dependent, dynamic and evolutionary 
aspects (Oluić-Vuković 1998). Oluić-Vuković studied how the increase in productivity of core 
journals affected the shape of the distribution curve in the upper section over an extended time 
interval (Oluić-Vuković 1989). By analyzing the research output of Croatian scholars in different 
subjects, she concluded that the Groos Droop or the S-shaped curve is caused by an increase in the 
concentration/dispersal disparity, which is reflected by an increase in the core/periphery ratio 
(Oluić-Vuković 1991). The dynamic evolution of Bradford curves and the emergence of the Groos 
Droop are presented in (Oluić-Vuković 1992), and similar empirical studies based on the 
temporal partitioning of bibliographies were conducted by Garg (Garg, Sharma et al. 1993), 
Wagner-Döbler (Wagner-Döbler 1997) and Sen (Sen and Chatterjee 1998). Meanwhile, stochastic models 
such as the Simon-Yule model have been increasingly employed to study the dynamic characteristics 
of bibliometric laws (Oluić-Vuković 1997, Oluić-Vuković 1998). The Simon-Yule model was 
introduced by Yule in 1924 for studying the distribution of biological genera by 
species number, but did not gain widespread recognition until Simon expanded upon the theory 
in 1955 to analyze the frequency distributions of words in writing samples (Simon 1955). In addition 
to theoretical methods that solve the case of a constant entry rate $\alpha$ of new sources exactly 
(Simon 1955), Monte Carlo simulations have been used to explore more intricate scenarios, such as a 
declining entry rate of new sources (Simon and Van Wormer 1963) and an autocorrelated 
growth rate $\gamma$ of established journals (also referred to as the aging or obsolescence of older journals) 
(Ijiri and Simon 1977). Chen et al. (Chen 1989, Chen, Chong et al. 1994, Chen, Chong et al. 1995) 
first employed the Simon-Yule model to study numerically the evolution of Lotka’s law and 
Bradford’s law over time. They found that the entry rate of new sources $\alpha$ and the autocorrelated 
growth rate of old journals $\gamma$ have significant yet opposite effects on the Bradford curves, and thus 
offered an explanation for the various types of Bradford curves (Chen, Chong et al. 1995). Later, 
Oluić-Vuković also examined the dynamics of the Bradford distribution using the Simon-Yule model, 
but she found that the steady-state solution of this model is too restricted to cope with the variation 
produced over time, which limits its applicability (Oluić-Vuković 1997, Oluić-Vuković 1998). In 
this paper, the Simon-Yule model is also used to examine the effects of different scenarios on the 
key parameters (e.g., journal number $T_0$, article number $A_0$ and maximum productivity $X_1$ of the 
core region) of the extended Egghe’s formula. However, it is not directly employed to forecast the 
evolution of Bradford curves or to compare them with empirical data. Instead, the key parameters 
are estimated from past empirical data to improve predictions of future Bradford curve 
evolution. 


2 Theoretical Study 
2.1 Simon-Yule Model 


Simon’s generating mechanism for the Bradford distribution is based on the following two 
assumptions, where $f_t(n,k)$ denotes the number of journals that have published exactly $n$ 
papers among the first $k$ published papers. 


Assumption I: There is a constant probability $\alpha$ that the $(k+1)$-th paper is published in a 
new journal, that is, a journal that has not published any of the first $k$ papers; 


Assumption II: The probability that the $(k+1)$-th paper is published in a journal that has 
published $n$ papers is proportional to $n f_t(n,k)$, that is, to the total number of papers of all 
journals that have published exactly $n$ papers. 


Therefore, if there are $A$ papers at a certain time point, then the corresponding journal number 
$T$ is approximately $T = \alpha A$. Based on Simon’s two assumptions, the steady-state solution of the 
Bradford distribution can be written as (Chen 1989): 


$$f_t(n) = \rho B(n, \rho + 1) \approx \rho\,\Gamma(\rho + 1)\,n^{-(\rho + 1)} \qquad (1)$$


where $B$ is the beta function, $\Gamma$ is the gamma function, and $\rho$ is a function of the entry rate of 
new sources $\alpha$, $\rho = 1/(1 - \alpha)$. From Equation (1) it can be noted that the analytical result of the 
Simon-Yule model is in accord with Lotka’s law as long as $\rho \approx 1$. 
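

As an illustrative aside, the steady-state expression of Equation (1) can be evaluated numerically. The minimal Python sketch below assumes the reading $f_t(n) = \rho B(n, \rho + 1)$ with the large-$n$ approximation $\rho\Gamma(\rho + 1)n^{-(\rho + 1)}$; the function names `yule_fraction` and `power_law_tail` are hypothetical and are not taken from the cited references.

```python
import math


def yule_fraction(n: int, rho: float) -> float:
    """rho * B(n, rho + 1), computed via log-gamma to avoid overflow for large n."""
    log_beta = math.lgamma(n) + math.lgamma(rho + 1.0) - math.lgamma(n + rho + 1.0)
    return rho * math.exp(log_beta)


def power_law_tail(n: int, rho: float) -> float:
    """Large-n (Lotka-type) approximation rho * Gamma(rho + 1) * n^-(rho + 1)."""
    return rho * math.gamma(rho + 1.0) * n ** (-(rho + 1.0))


alpha = 0.1                     # entry rate of new sources
rho = 1.0 / (1.0 - alpha)
for n in (1, 10, 100, 1000):
    print(n, yule_fraction(n, rho), power_law_tail(n, rho))
```

The two printed columns approach each other as $n$ grows, which is the sense in which the model reduces to a power law with exponent $\rho + 1$, close to 2 when $\rho \approx 1$.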


In addition to the analytical solutions, Monte Carlo simulations are carried out for the case 
$\alpha = 0.1$, and the results are compared with the theoretical results of Equation (1), as shown in 
Figure 1. The procedures for conducting the Monte Carlo simulations can be found in Reference 
(Simon and Van Wormer 1963) and are thus omitted here. To reduce the randomness inherent 
in this stochastic model and capture the underlying rule more accurately, each case is simulated 
$N = 10000$ times, and only the means over all these simulations are used as the final outputs. 
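

A single run of the generating mechanism defined by Assumptions I and II can be sketched as follows. This is a minimal illustration rather than the procedure of Simon and Van Wormer (1963): the function name, the seed handling, and the device of sampling a random earlier paper (which selects a journal with probability proportional to its current size) are assumptions made here, and the averaging over $N = 10000$ runs is omitted.

```python
import random
from collections import Counter


def simulate_simon_yule(num_papers: int, alpha: float, seed: int = 0) -> list[int]:
    """Return the list of journal sizes after num_papers papers have been published."""
    rng = random.Random(seed)
    journal_of_paper = [0]      # the first paper necessarily founds journal 0
    sizes = [1]
    for _ in range(num_papers - 1):
        if rng.random() < alpha:
            # Assumption I: the paper appears in a new journal.
            sizes.append(1)
            journal_of_paper.append(len(sizes) - 1)
        else:
            # Assumption II: picking a uniformly random earlier paper selects
            # its journal with probability proportional to the journal's size.
            j = journal_of_paper[rng.randrange(len(journal_of_paper))]
            sizes[j] += 1
            journal_of_paper.append(j)
    return sizes


sizes = simulate_simon_yule(10_000, alpha=0.1, seed=1)
f = Counter(sizes)              # f[n] = number of journals with exactly n papers
print(len(sizes), max(sizes), f[1])
```

Because each run returns integer journal counts, journals with very high productivity appear either once or not at all, which is the integer constraint discussed in this section.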


From Figure 1, it can be clearly noted that the simulation results can be divided into two zones, 
namely the normal zone (blue circles) and the core zone (red squares). This division arises because 
the actual journal number $f(n)$ must be an integer: when the journal productivity $n$ is so large 
that the corresponding theoretical journal number $f_t(n)$ falls below one, the actual journal number 
$f(n)$ can only be zero or one, thereby deviating from the theoretical results, as shown by the red 
squares in Figure 1. Meanwhile, it can be noted from Figure 1(b) that although the number of 
journals in the core region is small, its contribution to the number of papers is significant. 
Therefore, it is crucial to predict the paper number $X_r$ of each journal in the core zone in order to 
depict the Bradford’s curve accurately. 
Figure 1 Comparisons of the theoretical and numerical results: (a) number of journals f(n) with 
productivity n; (b) number of papers nf (n) by journals with productivity n 


In order to estimate the journal productivity $X_r$ of the core region, it is helpful to determine 
the journal number $T_0$ and paper number $A_0$ first. From Figure 1 it can be noted that the 
journal number $f(n)$ and paper number $nf(n)$ of the normal region match the theoretical results 
very well, so the total journal number $T_1$ and total paper number $A_1$ of the normal region 
can be obtained directly by summation, $T_1 = \sum_{n=1}^{y_m} f(n)$ and 
$A_1 = \sum_{n=1}^{y_m} n f(n)$, where $y_m$ is the journal productivity at which $f_t(y_m) \approx 1$. 
According to Equation (1), the analytical expression of $y_m$ can be derived as: 


$$y_m = [A(\rho - 1)\Gamma(\rho + 1)]^{\frac{1}{\rho + 1}} \qquad (2)$$


After $y_m$ is calculated, the total number of journals $T_0$ and papers $A_0$ of the core 
region can be calculated by $T_0 = T - T_1$ and $A_0 = A - A_1$. Alternatively, $T_0$ and $A_0$ 
can be calculated directly by: 


$$T_0 = \int_{y_m}^{+\infty} T f_t(n)\,dn = \frac{y_m}{\rho} \qquad (3)$$

$$A_0 = \int_{y_m}^{+\infty} T n f_t(n)\,dn = \frac{y_m^2}{\rho - 1} \qquad (4)$$


where $y_m$ is calculated by Equation (2). 
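

The core-region quantities can be computed directly once $A$ and $\alpha$ are given. The sketch below is illustrative only and assumes the following reading of Equations (2) to (4): $y_m = [A(\rho - 1)\Gamma(\rho + 1)]^{1/(\rho + 1)}$, $T_0 = y_m/\rho$ and $A_0 = y_m^2/(\rho - 1)$, with $\rho = 1/(1 - \alpha)$ and $0 < \alpha < 1$. The function name `core_region_parameters` is hypothetical.

```python
import math


def core_region_parameters(total_papers: float, alpha: float) -> tuple[float, float, float]:
    """Return (y_m, T_0, A_0) under the assumed reading of Equations (2)-(4)."""
    rho = 1.0 / (1.0 - alpha)
    # Equation (2): productivity at which the theoretical journal count drops to one.
    y_m = (total_papers * (rho - 1.0) * math.gamma(rho + 1.0)) ** (1.0 / (rho + 1.0))
    t0 = y_m / rho                  # Equation (3): journals in the core region
    a0 = y_m ** 2 / (rho - 1.0)     # Equation (4): papers in the core region
    return y_m, t0, a0


y_m, t0, a0 = core_region_parameters(total_papers=1e4, alpha=0.1)
print(f"y_m = {y_m:.1f}, T_0 = {t0:.1f}, A_0 = {a0:.0f}")
```

Since $T_0$ and $A_0$ come from a continuous approximation, they are generally not integers and would be rounded before being used as a journal count.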
Figure 2 The journal productivity of the core region as a function of the journal rank: (a) the journal 
productivity $X_r$; (b) the journal productivity ratio $X_1/X_r$ 


In the Simon-Yule model with constant entry rate $\alpha$, it is found that the largest paper number 
$X_1$ that one journal can have can be estimated by using Gumbel’s $r$-th characteristic extreme theory 
(Glänzel 2010, Glänzel 2013), which is: 


$$G(X_r) = \int_{X_r}^{+\infty} A f_t(n)\,dn = r \qquad (5)$$


By solving this equation, it can be derived that the productivity $X_1$ of the most productive journal 
can be written as: 


$$X_1 = [A\Gamma(\rho + 1)]^{\frac{1}{\rho}} = (\rho - 1)^{-\frac{1}{\rho}}\, y_m^{\frac{\rho + 1}{\rho}} \qquad (6)$$


whereas the productivity $X_r$ of the $r$-th most productive journal is related to $X_1$ by 
$X_r = X_1 r^{-1/\rho}$. The comparison of Gumbel’s $r$-th characteristic extreme values with the means of 
the simulation results is shown in Figure 2, from which it can be noted that although Gumbel’s 
characteristic extreme theory can be used to predict the largest paper number $X_1$, it cannot be used 
to estimate the other paper numbers in the core zone, $X_r$, $r = 2, 3, \cdots, T_0$. Therefore, other 
methods must be used, and it is assumed in this paper that all other $X_r$ $(r = 2, 3, \cdots, T_0)$ are 
related to the largest paper number $X_1$ through the following equation: 


$$\frac{X_1}{X_r} = k(r - 1) + 1 \qquad (7)$$


where $k$ is the only parameter to be determined. This equation can be derived from 
Equation (8) of Reference (Chen 1989) by assuming that both $(r_i - r_1)/(r_1 + b)$ and $-1/c$ are 
relatively small. The validity of this equation can also be observed directly in Figure 2(b), where 
the blue circles denote the simulation results and the blue dashed lines denote the linear fitting 
results. Therefore, the productivity of the $r$-th most productive journal can be derived from 
Equation (7), and the accumulated productivity of the first $r$ most productive journals can be 
written as: 


$$R_c(r) = \sum_{i=1}^{r} \frac{X_1}{k(i - 1) + 1} \qquad (8)$$


If there are $T_0$ journals with $A_0$ papers in the core region and the values of $T_0$ and $A_0$ 
are known, then the parameter $k$ can be calculated from the equation $R_c(T_0) = A_0$. Equation 
(8) can then be used to predict the evolution of the core region ($r \le T_0$) of the Bradford’s curve. 
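

The condition $R_c(T_0) = A_0$ has no closed-form solution for $k$, but it can be solved numerically. The sketch below is one possible approach, not necessarily the one used by the authors: it assumes the reading $R_c(r) = \sum_{i=1}^{r} X_1/[k(i - 1) + 1]$ of Equation (8), an integer $T_0 \ge 2$, and a bisection search over an assumed bracket $0 \le k \le 10^6$. The function names are hypothetical.

```python
def core_cumulative(r: int, x1: float, k: float) -> float:
    """Equation (8): accumulated productivity of the r most productive journals."""
    return sum(x1 / (k * (i - 1) + 1.0) for i in range(1, r + 1))


def solve_k(x1: float, t0: int, a0: float, k_max: float = 1e6) -> float:
    """Find k such that R_c(T_0) = A_0 by bisection; R_c(T_0) decreases as k grows."""
    if t0 < 2:
        raise ValueError("k is only identifiable when the core holds at least two journals")
    lo, hi = 0.0, k_max
    if not (core_cumulative(t0, x1, hi) <= a0 <= core_cumulative(t0, x1, lo)):
        raise ValueError("A_0 is outside the range reachable with 0 <= k <= k_max")
    for _ in range(200):                 # fixed iteration count guarantees termination
        mid = 0.5 * (lo + hi)
        if core_cumulative(t0, x1, mid) > a0:
            lo = mid                     # sum still too large, so k must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Round trip on synthetic values: build A_0 from a known k, then recover k.
a0 = core_cumulative(20, 500.0, 0.8)
print(solve_k(500.0, 20, a0))
```

Bisection is chosen here only because the sum is monotone in $k$; any one-dimensional root finder would serve.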


2.2 Egghe’s formula 


After removing the $T_0$ journals and $A_0$ papers of the core region, the remaining $T_1$ journals and 
$A_1$ papers match well with the theoretical results predicted by Equation (1). They therefore 
follow Lotka’s law, and their Bradford curve can be predicted with the revised Leimkuhler and 
Egghe’s formula, which can be written as (Egghe 1990): 


$$R(r_1) = a\log(1 + br_1) \qquad (9)$$


where the key parameters $a$ and $b$ are as follows: 


$$a = \frac{A_1}{\log(e^{\gamma} y_m)} \qquad (10)$$

$$b = \frac{e^{\gamma} y_m - 1}{T_1} \qquad (11)$$


where $\gamma$ is Euler’s constant, $\gamma \approx 0.5772$, and $y_m$ is the journal productivity at which the 
corresponding theoretical journal number $f_t(y_m) \approx 1$. $y_m$ can be calculated directly from 
Equation (2), but can also be estimated from the following equation if the values of $X_1$, $T_0$ 
and $A_0$ are known: 


$$\frac{X_1}{y_m} = k(T_0 - 1) + 1 \qquad (12)$$


Since the journal productivity of the core region is higher than that of the normal region, the 
journals of the core region occupy the lowest ranks. As a result, the Bradford curve of 
the normal region starts at the point $(T_0, A_0)$: every rank $r_1$ of the normal region should 
be transformed into $r = r_1 + T_0$, and the accumulated productivity of the first $r$ journals 
should be transformed into $R_n(r) = R(r_1) + A_0$. The revised Egghe’s formula for the normal 
region can then be written as: 


$$R_n(r) = R(r - T_0) + A_0 = a\log[1 + b(r - T_0)] + A_0 \qquad (13)$$


The revised Egghe’s formula of Equation (13) can be used to predict the dynamic evolution 
of the normal region ($T_0 < r \le T$) of the Bradford’s curve. Thus, Equations 
(8) and (13) can be used together to predict the dynamic evolution of the Bradford’s curves. 
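

Combining the two regions gives a piecewise curve. The sketch below is a minimal illustration under stated assumptions: it takes $a = A_1/\log(e^{\gamma} y_m)$ and $b = (e^{\gamma} y_m - 1)/T_1$ as the reading of Equations (10) and (11), uses the sum $\sum_{i=1}^{r} X_1/[k(i - 1) + 1]$ for the core region, and uses $a\log[1 + b(r - T_0)] + A_0$ for the normal region. The function names and the numerical values in the usage lines are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def egghe_parameters(a1: float, t1: float, y_m: float) -> tuple[float, float]:
    """Parameters a and b of the normal-region formula (assumed Equations (10)-(11))."""
    scale = math.exp(EULER_GAMMA) * y_m
    return a1 / math.log(scale), (scale - 1.0) / t1


def bradford_curve(r: int, x1: float, k: float, t0: int, a0: float,
                   a: float, b: float) -> float:
    """Accumulated productivity of the first r journals over both regions."""
    if r <= t0:
        # Core region, Equation (8).
        return sum(x1 / (k * (i - 1) + 1.0) for i in range(1, r + 1))
    # Normal region, Equation (13): rank and ordinate shifted by (T_0, A_0).
    return a * math.log(1.0 + b * (r - t0)) + a0


# Hypothetical bibliography: T = 1000 journals, A = 10000 papers, 20 core journals.
a, b = egghe_parameters(a1=6000.0, t1=980, y_m=30.0)
print(bradford_curve(1000, 1500.0, 0.8, 20, 4000.0, a, b))
```

With these definitions the normal-region branch ends at $(T, A)$ by construction, because $a\log(1 + bT_1) = A_1$.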
Figure 3 the cause of the Groos Droop and the Bradford’s curve evolution. (a) the cause of the Groos 
Droop; (b) the evolution of Bradford’s curve 


The Bradford’s curve is shown in Figure 3(b), in which the blue circles 
denote the normal zone and the red squares denote the core zone. The blue dashed 
lines denote the predictions of Equation (13), whilst the red dotted lines denote the predictions 
of Equation (8). The three black squares denote the points $(1, X_1)$, $(T_0, A_0)$ and $(T, A)$ 
respectively. From the above discussion, it can be noted that these three points and two lines are 
the most important elements for predicting the Bradford’s curve evolution. 


2.3 Groos Droop 


Groos was the first to note that the Bradford’s curve bends downward in the region of 
low-productivity journals (Groos 1967). Egghe attributed the Groos Droop to the merging of 
datasets (Egghe and Rousseau 1988), but this section shows that it is the existence of the 
core region that causes the Groos Droop in the normal region. 


The first and second derivatives of $R_c(r)$ of the core region can be derived from Equation 
(8): 


$$\frac{\partial R_c(r)}{\partial(\log r)} = \frac{X_1 r}{k(r - 1) + 1} \qquad (14)$$

$$\frac{\partial^2 R_c(r)}{\partial(\log r)^2} = \frac{X_1(1 - k)r}{[k(r - 1) + 1]^2} \qquad (15)$$


From Equation (15), it can be noted that when $k > 1$, $\partial^2 R_c(r)/\partial(\log r)^2 < 0$ and the 
Bradford’s curve of the core region is concave downward, whereas when $k < 1$, 
$\partial^2 R_c(r)/\partial(\log r)^2 > 0$ and the Bradford’s curve of the core region is concave upward. As the 
entry rate of new sources $\alpha$ increases, the journal number $T$ increases and the distribution of 
articles becomes more dispersed. As a result, the largest journal productivity $X_1$ decreases. 
Meanwhile, it can be noted from Equation (8) and $R_c(T_0) = A_0$ that a lower $X_1$ implies a lower 
$k$ if $A_0$ is relatively constant. Therefore, as $\alpha$ increases, the Bradford’s curve of the core 
region gradually becomes concave upward, as shown in Figure 3. 


Similarly, the first and second derivatives of $R_n(r)$ of the normal region can be derived from 
Equation (13): 


$$\frac{\partial R_n(r)}{\partial(\log r)} = \frac{abr}{b(r - T_0) + 1} \qquad (16)$$

$$\frac{\partial^2 R_n(r)}{\partial(\log r)^2} = \frac{ab(1 - bT_0)r}{[b(r - T_0) + 1]^2} \qquad (17)$$


From Equation (17) it can be noted that when $T_0 > 1/b$, $\partial^2 R_n(r)/\partial(\log r)^2 < 0$ and the 
Bradford’s curve of the normal region is concave downward and thus shows a Groos Droop, whereas 
when $T_0 < 1/b$, $\partial^2 R_n(r)/\partial(\log r)^2 > 0$ and the Bradford’s curve of the normal region is concave 
upward and thus shows a J-shaped curve. As the entry rate of new sources $\alpha$ increases, it can be 
noted from Figure 3(a) that $T_0$ gradually falls below $1/b$, and thus the Bradford’s curve of the 
normal region eventually becomes concave upward, similar to the case of the core region. 


The variations of the key parameters $T_0$, $1/b$ and $k$ with the entry rate $\alpha$ are shown in Figure 
3(b), from which it can be noted that when $A = 10^4$, the normal region turns concave upward 
at the critical point $\alpha_n \approx 0.2$, whilst the core region turns concave upward at the critical point 
$\alpha_c \approx 0.3$. Therefore, when $\alpha < 0.2$ the whole Bradford’s curve is concave downward, and when 
$\alpha > 0.3$ the whole Bradford’s curve is concave upward. When $0.2 < \alpha < 0.3$, the whole Bradford’s 
curve shows a reversed S shape, with the core region concave downward and the normal region 
concave upward. The three shapes of Bradford’s curves are shown in Figure 3(a). In this particular 
case, since $\alpha_n < \alpha_c$, there is no S-shaped Bradford’s curve. This is because the aging of the 
sources is not considered here, so the largest journal productivity $X_1$ is relatively large. When the 
effect of auto-correlation is considered, as discussed in Section 3, the corresponding $X_1$ 
decreases significantly, which results in a lower $k$ and thus a much lower $\alpha_c$. When 
$\alpha_c < \alpha_n$, the S-shaped Bradford’s curve appears if $\alpha_c < \alpha < \alpha_n$, with the core region 
concave upward and the normal region concave downward. 
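

The two concavity conditions of this section can be combined into a simple shape test. The sketch below is illustrative: it applies the criteria stated above (core region concave downward when $k > 1$, normal region concave downward when $T_0 > 1/b$); the function name `classify_shape` and the returned labels are hypothetical.

```python
def classify_shape(k: float, b: float, t0: float) -> str:
    """Classify the overall Bradford curve from the concavity of its two regions."""
    core_down = k > 1.0            # sign of Equation (15)
    normal_down = t0 > 1.0 / b     # sign of Equation (17)
    if core_down and normal_down:
        return "concave downward"
    if core_down:
        return "reversed S"        # core downward, normal region upward
    if normal_down:
        return "S-shaped"          # core upward, normal region downward (Groos Droop)
    return "concave upward"


print(classify_shape(k=0.5, b=0.1, t0=20))
```

The boundary cases $k = 1$ and $T_0 = 1/b$, where the second derivative vanishes, are assigned to the upward branch here purely for simplicity.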
2.4 Bradford Dynamics 


Since the analytical expressions of $T_0$, $A_0$ and $X_1$ (Equations (3), (4) and (6)) are all 
known, the core region of the Bradford curve at any time can be predicted by using Equation (8). Since 
the key parameters of the normal region can be derived from the above three quantities through 
$T_1 = T - T_0$, $A_1 = A - A_0$ and Equation (12), the normal region of the Bradford curve can be 
predicted by using Equation (13). From Figure 3 it can be noted that when $\alpha = 0.1$, the whole 
Bradford’s curve is concave downward for $10^2 < A < 10^4$. The results shown in Figure 4(a) are in 
accord with this theoretical prediction. The key parameters as functions of the paper number 
are shown in Figure 4(b), from which it can be noted that the theoretical results match the 
numerical ones very well. It can also be noted that all these key parameters are linear functions of 
the paper number $A$ in the log-log axis; therefore, Equation (18) will be used for studying the 
variations of $T_0$, $A_0$ and $X_1$ in the more complicated scenarios. 


$$\log(Y) = a_\rho + b_\rho\log(A) \qquad (18)$$


where $Y$ denotes any of $T_0$, $A_0$ and $X_1$, and the constants $a_\rho$ and $b_\rho$ are functions of 
$\rho$. Since $\rho$ is approximately one, $a_\rho$ and $b_\rho$ can be viewed as constants. 
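

The constants of Equation (18) can be estimated by ordinary least squares on log-transformed data. The sketch below is a minimal illustration; the base-10 logarithm is an assumption (the slope does not depend on the base, the intercept does), and the function name `fit_log_log` and the synthetic data are hypothetical.

```python
import math


def fit_log_log(a_values: list[float], y_values: list[float]) -> tuple[float, float]:
    """Least-squares fit of log10(Y) = intercept + slope * log10(A)."""
    xs = [math.log10(a) for a in a_values]
    ys = [math.log10(y) for y in y_values]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic power law Y = 3 * A^0.5, used only to exercise the fit.
papers = [100.0, 1000.0, 10000.0]
intercept, slope = fit_log_log(papers, [3.0 * a ** 0.5 for a in papers])
print(intercept, slope)
```

A fitted pair (intercept, slope) then gives $Y = 10^{\text{intercept}} A^{\text{slope}}$ for any later value of $A$.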


The Bradford’s curves of the constant entry rate scenario are shown in Figure 4(a), where the 
red squares and blue circles denote the simulation results of the core and normal regions respectively, 
whereas the red dashed lines and blue dotted lines denote the theoretical results of Equations (8) 
and (13) respectively. The three key points $(1, X_1)$, $(T_0, A_0)$ and $(T, A)$ are also shown as a 
black upward triangle, diamond and downward triangle respectively in Figure 4(a). From Figure 4(a) it 
can be noted that although the core region includes far fewer journals than the normal region, 
its share of the plot is significant due to the log scale of the x-axis. Therefore, journals with lower 
ranks are better represented in Figure 4(a). 
Figure 4 the dynamics of the Bradford’s curve and the variation of the key parameters. (a) the 
dynamics of the Bradford’s curves; (b) the variations of the key parameters 


The simulation and analytical results of $T_0$, $A_0$ and $X_1$ are shown in Figure 4(b), where 
the hollow symbols denote the simulation results and the solid symbols denote the analytical 
ones. From Figure 4(b) it can be noted that the three key parameters are all linear functions of the 
paper number $A$ in the log-log axis, just as Equation (18) indicates. It is also notable that the 
analytical results match the numerical ones very well, which verifies the validity of Equations (3), 
(4) and (6). The analytical result of $T_0$ is slightly lower than the numerical one because $y_m$ is 
higher than the numerical result. 


3 Numerical Study 
3.1 Decreasing Entry Rate 


Assume that the probability of adding a new journal decreases linearly with time: 


$$\alpha(i) = \alpha_s - ki \qquad (19)$$


where $k$ is a constant, $k = (\alpha_s - \alpha_f)/A_f$, $\alpha_s$ is the initial entry rate of new sources, and 
$\alpha_f$ and $A_f$ are the entry rate of new sources and the total article number in the final state 
respectively. Then the accumulated journal number can be written as: 


$$T = \sum_{i=1}^{A} \alpha(i) \approx \alpha_s A - \frac{1}{2}kA^2 \qquad (20)$$


According to Equation (20), a quadratic fit of $T$ against $A$ can be used to determine the values 
of $\alpha_s$ and $\alpha_f$. Once $\alpha_s$ and $\alpha_f$ are determined, the mean entry rate 
$\bar{\alpha} = (\alpha_s + \alpha_f)/2$ can be used to calculate the analytical results. 
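

The quadratic fit can be carried out as a two-parameter least-squares problem without an intercept. The sketch below is one possible implementation, assuming the form $T \approx \alpha_s A - \frac{1}{2}kA^2$ and taking the final state to be the largest observed $A$; the function name `fit_declining_entry_rate` and the synthetic data are hypothetical.

```python
def fit_declining_entry_rate(a_values: list[float],
                             t_values: list[float]) -> tuple[float, float, float]:
    """Fit T = alpha_s * A - 0.5 * k * A^2 and return (alpha_s, alpha_f, mean alpha)."""
    suu = suv = svv = sut = svt = 0.0
    for a, t in zip(a_values, t_values):
        u, v = a, -0.5 * a * a           # basis functions multiplying alpha_s and k
        suu += u * u
        suv += u * v
        svv += v * v
        sut += u * t
        svt += v * t
    det = suu * svv - suv * suv          # determinant of the 2x2 normal equations
    alpha_s = (sut * svv - suv * svt) / det
    k = (suu * svt - suv * sut) / det
    alpha_f = alpha_s - k * max(a_values)    # Equation (19) evaluated at A_f
    return alpha_s, alpha_f, 0.5 * (alpha_s + alpha_f)


# Synthetic data: entry rate falling from 0.3 to 0.2 over 10000 papers.
papers = [1000.0 * i for i in range(1, 11)]
journals = [0.3 * a - 0.5 * 1e-5 * a * a for a in papers]
print(fit_declining_entry_rate(papers, journals))
```

For real bibliographies the columns $A$ and $A^2$ are strongly correlated, so rescaling $A$ before fitting may improve numerical conditioning.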


The Bradford’s curves and the variations of the key parameters are shown in Figure 5, from which it 
can be noted that the proposed method can still predict the variations of the Bradford’s curves well. 
In general, the analytical results based on $\bar{\alpha}$ match the simulation results well. The numerical 
results of $A_0$ and $X_1$ are slightly lower than the analytical ones. Therefore, the decreasing entry 
rate has an additional negative effect on the growth of $A_0$ and $X_1$. 
Figure 5 the dynamics of the Bradford’s curve and the variation of the key parameters (a) when the 
entry rate decreases from 0.3 to 0.2; (b) when the entry rate decreases from 0.2 to 0.1. 


From Figure 5, it can also be noted that the effects of the decreasing entry rate on the shape of 
the Bradford’s curve and on the variation of the key parameters are relatively insignificant. It is 
therefore still possible to use the analytical results for the constant entry rate $\bar{\alpha}$ to predict the 
key parameters without introducing too much error. 
3.2 Decaying Rate 


Simon assumes that only one paper is published in each time period. He models the 
probability that a journal increases in size in the next period as proportional to a weighted sum 
of its past increments. These increments are weighted by a factor that decreases geometrically over 
time, with the rate of decrease denoted as $\gamma$. 


Let $y_j(k)$ be the change in size of the $j$th journal during the $k$th time interval, where $y_j(k)$ 
is either unity or zero (the journal either experiences a unit increment in size or remains the same 
size during any given time interval). Then the size of the $j$th journal at the end of the $k$th interval 
is simply $\sum_{\tau=1}^{k} y_j(\tau)$. The expected increment in size of the $j$th journal during the 
$(k+1)$th interval is: 


$$p[y_j(k+1)] = \frac{1}{W_k}\sum_{\tau=1}^{k} y_j(\tau)\,\gamma^{k-\tau} \qquad (21)$$


where $W_k$ is a function of time that is the same for all journals, $W_k = \sum_j w_j(k)$, where 
$w_j(k) = \sum_{\tau=1}^{k} y_j(\tau)\,\gamma^{k-\tau}$, and $\gamma$ is the fraction that determines how rapidly the 
influence of past growth on new growth dies out. 

Figure 6 illustrates the impact of the decaying rate $\gamma$ on the dynamics of Bradford’s curves 
and the variations of the key parameters. It is evident that the aging term notably increases $T_0$, 
causing the normal region to become more concave downward. The aging term also markedly reduces 
$X_1$ by undermining the Matthew effect. As old article sources lose their appeal for new papers, the 
"success breeds success" effect diminishes, resulting in a significant decrease in $X_1$, as depicted in 
Figure 6(a). Consequently, while the article number $A_0$ in the core zone remains relatively 
unchanged, $T_0$ needs to increase to compensate for the reduction of $X_1$. This expansion of the core 
region enlarges its share of the Bradford’s curve, tending to shape it into a J-shape due to the 
reduction of $k$ and $X_1$, as discussed in Section 2.3. Additionally, with the significant increase in 
$T_0$, it is more likely that $T_0$ will exceed $1/b$, further contributing to the concave downward shape 
of the normal region. In summary, the decaying rate makes it easier for the Bradford’s curve to 
adopt an S-shaped form. 


Figure 6 the dynamics of the Bradford’s curve and the variation of the key parameters. (a) when the 
entry rate is 0.1 and the decaying rate is 0.95; (b) when the entry rate is 0.1 and the decaying rate 
increases from 0.95 to 1.0 


3.3 Varying Decaying and Entry Rate 


Figure 7 illustrates the impact of changing decaying and entry rates on Bradford’s curve. In 
real-world scenarios, both the entry rate of new sources and the decaying rate often fluctuate. For 
instance, the entry rate might linearly shift from 0.3 to 0.1, while the decaying rate linearly shifts 
from 0.95 to 1.0. Observing Figure 7, it becomes apparent that increasing the decaying rate and 
decreasing the entry rate generally exert opposite effects on Bradford’s curve. 


When the entry rate $\alpha$ decreases and the decaying rate $\gamma$ varies or remains constant, the 
largest journal productivity $X_1$ also diminishes due to the reduced Matthew effect. Consequently, the 
core region is likely to exhibit a J-shaped curve, resembling scenarios with constant $\alpha$ and varying 
or constant $\gamma$. However, the total article number $A_0$ in the core region deviates significantly from 
the analytical results, while the total journal number $T_0$ in the core region remains relatively 
consistent with the analytical predictions, which is similar to scenarios involving decreasing $\alpha$ 
and opposite to scenarios with $\gamma$. As a result, the normal region of Bradford’s curve becomes less 
concave downward, with its starting point on the y-axis notably lower, resembling trends seen with 
decreasing entry rates rather than decaying rates. Notably, all three key parameters continue to 
exhibit linear relationships with the article number, suggesting Equation (18) remains viable for 
predicting them. 


Figure 7 the dynamics of the Bradford’s curve and the variation of the key parameters. (a) when the 
entry rate decreases from 0.3 to 0.1 and the decaying rate is 0.95; (b) when the entry rate decreases 
from 0.3 to 0.1 and the decaying rate increases from 0.95 to 1.0. 


4 Empirical Study 
4.1 Dataset of Croatian Chemistry Research 


The research output in chemistry of authors from Croatia was used by Oluić-Vuković for the 
preparation of full bibliographic references for a ten-year period (Oluić-Vuković 1992). Only 
articles published in journals were taken into consideration, and the dataset comprises 2543 papers 
published in 416 journals over a 10-year interval. The journal productivity of the first few (fewer than 
10) most productive journals was taken directly from Figure 1 of Reference (Oluić-Vuković 1992), 
whilst the productivity of the other journals was taken directly from Tables 4 and 6 of Reference 
(Oluić-Vuković 1998). Data taken from the figures were adjusted to comply with the total numbers 
of journals and articles obtained from Table 2 of Reference (Oluić-Vuković 1992). 


In order to predict the dynamics of Bradford’s curve, the first step is to predict the variation 
of the total article number $A(t)$ with time $t$. A logistic growth curve (Verhulst 1838) is fitted to 
the empirical data so that the total article number $A(t)$ at any time can be predicted, as shown in 
Figure 8(a). After that, the total journal number $T$ and the entry rate of new sources $\alpha$ can be 
estimated by plotting the total journal number $T$ against the total article number $A$ and applying 
the linear fit $T = \alpha A$, as shown in Figure 8(b). Once the point $(T, A)$ is determined for any 
time, linear regression of Equation (18) in log-log axes is employed to determine the three key 
parameters $T_0$, $A_0$ and $X_1$, and the fitting results are shown in Figure 9(a). The key 
points $(T_0, A_0)$ and $(1, X_1)$ at any time can then be determined. Finally, based on the three key 
points, Equations (8) and (13) are used to determine the Bradford curve for the core region 
and the normal region respectively, and the results are shown as the red dashed lines and blue dotted 
lines in Figure 9(b) respectively. 
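
The first two fitting stages described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for Figure 8: the function names (`fit_logistic`, `fit_entry_rate`), the three-parameter logistic form $A(t) = K / (1 + e^{-r(t - t_0)})$, and the grid search over the ceiling $K$ are all choices made for this example.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def logistic(t, K, r, t0):
    """Assumed growth curve A(t) = K / (1 + exp(-r * (t - t0)))."""
    return K / (1.0 + math.exp(-r * (t - t0)))


def fit_logistic(ts, As, n_grid=400):
    """Fit (K, r, t0) by scanning candidate ceilings K.

    For each K the curve is linearised as ln(K/A - 1) = -r*t + r*t0,
    solved by least squares, and the K with the smallest squared error
    in the original (non-log) scale is kept.
    """
    a_max = max(As)
    best = None
    for i in range(1, n_grid + 1):
        K = a_max * (1.0 + 0.01 * i)  # ceilings from 1.01 to 5.00 times max(As)
        zs = [math.log(K / a - 1.0) for a in As]
        intercept, slope = fit_line(ts, zs)
        r = -slope
        if r <= 0:
            continue  # not a growth curve
        t0 = intercept / r
        sse = sum((logistic(t, K, r, t0) - a) ** 2 for t, a in zip(ts, As))
        if best is None or sse < best[0]:
            best = (sse, K, r, t0)
    if best is None:
        raise ValueError("no increasing logistic curve fits the data")
    return best[1], best[2], best[3]


def fit_entry_rate(As, Ts):
    """Least-squares slope alpha of the line T = alpha * A through the origin."""
    return sum(a * t for a, t in zip(As, Ts)) / sum(a * a for a in As)
```

Given yearly cumulative article counts, `fit_logistic` returns the parameters needed to extrapolate $A(t)$, and `fit_entry_rate` applied to the observed $(A, T)$ pairs gives the slope $\alpha$ from which $T$ follows at any predicted $A$. A nonlinear least-squares routine would be a natural replacement for the grid search when one is available.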
Figure 8 the dynamics of the Bradford’s curve and the variation of the key parameters. (a) the 
dynamics of the Bradford’s curves; (b) the variations of the key parameters 

Figure 9 the dynamics of the Bradford’s curve and the variation of the key parameters. (a) the 
dynamics of the Bradford’s curves; (b) the variations of the key parameters 


From Figure 9(b) it can be noted that Bradford’s curve gradually turns from a J-shape into an 
S-shape, and this transition is well captured by the analytical prediction. 


4.2 Dataset of Solar Power Research 


The bibliographies on solar power research for the years 1971, 1974, 1977, 1980, 1983 and 
1986, for papers published in journals from Engineering Index, were prepared by Garg et al. (Garg, 
Sharma et al. 1993). The data used in the following analysis are taken directly from Tables 1~7 
of Reference (Garg, Sharma et al. 1993). 


Similar to the case of the Croatian Chemistry dataset, the prediction of the dynamics of 
Bradford’s curve includes the following four steps: 


1. Fitting a logistic growth curve to the empirical data of cumulative article number vs 
time (obtained from Table 7) to predict the article number at the desired intervals, 
as shown in Figure 10(a). 


2. Applying the quadratic fitting of Equation (20) to the journal and article pairs to obtain the 
estimated journal number $T$ and entry rate of new sources $\alpha$ at any article number $A$, as 
shown in Figure 10(b). 


3. Applying the linear fitting of Equation (18) to the empirical data of $T_0$, $A_0$ and $X_1$ in 
log-log axes to obtain their predictions at any article number $A$, as shown in Figure 11(a). 


4. Applying Equations (8) and (13) to draw Bradford’s curve for the core region and 
normal region respectively, as shown in Figure 11(b). 
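
Steps 2 and 3 above are both small least-squares problems, and a hedged sketch may make them concrete. The exact forms of Equations (18) and (20) are not reproduced here, so the code assumes a quadratic through the origin, $T = c_1 A + c_2 A^2$, with the entry rate read off as the slope $\alpha(A) = dT/dA$, and a straight line in log-log axes, $\ln y = \ln k + b \ln A$, for each key parameter. All function names are hypothetical.

```python
import math


def fit_quadratic_through_origin(As, Ts):
    """Least-squares (c1, c2) for the assumed form T = c1*A + c2*A**2."""
    s2 = sum(a ** 2 for a in As)
    s3 = sum(a ** 3 for a in As)
    s4 = sum(a ** 4 for a in As)
    sat = sum(a * t for a, t in zip(As, Ts))
    sa2t = sum(a * a * t for a, t in zip(As, Ts))
    det = s2 * s4 - s3 * s3  # determinant of the 2x2 normal equations
    c1 = (sat * s4 - sa2t * s3) / det
    c2 = (s2 * sa2t - s3 * sat) / det
    return c1, c2


def journal_number(c1, c2, A):
    """Predicted cumulative journal number T at article number A."""
    return c1 * A + c2 * A * A


def entry_rate(c1, c2, A):
    """Assumed entry rate of new sources: the local slope dT/dA."""
    return c1 + 2.0 * c2 * A


def fit_power_law(xs, ys):
    """Straight-line fit in log-log axes; returns (k, b) with y = k * x**b."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx
    return math.exp(my - b * mx), b
```

With a decreasing entry rate the fitted `c2` is negative, so `entry_rate` falls as the article number grows. `fit_power_law` would be applied separately to each of $T_0$, $A_0$ and $X_1$ against $A$.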


[Figure 10 plots: (a) empirical data with logistic fitting; (b) cumulative journal number T versus cumulative article number A, empirical data with quadratic fitting.] 


Figure 10 the dynamics of the Bradford’s curve and the variation of the key parameters. (a) the 
dynamics of the Bradford’s curves; (b) the variations of the key parameters 

Figure 11 the dynamics of the Bradford’s curve and the variation of the key parameters. (a) the 
dynamics of the Bradford’s curves; (b) the variations of the key parameters 


From Figures 9(b) and 11(b) it can be noted that although the proposed method can roughly 
predict how Bradford’s curves change over time, some errors remain. One reason is that 
Bradford’s law is inherently subject to statistical uncertainty: our numerical analysis shows that the 
number of articles at each journal rank has a large standard deviation, making it practically 
impossible to predict the shape of Bradford’s curve precisely. Additionally, the method involves 
several fitting procedures, each of which introduces further error. Thus, while the approach can give a 
general idea of how Bradford’s dynamics might unfold, it cannot accurately forecast the number of 
articles for each journal at any given time. 
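
The run-to-run scatter mentioned above can be illustrated with a small Monte Carlo sketch of the Simon-Yule process. It is a simplified stand-in rather than the simulation code behind the reported results: it assumes a constant entry rate, omits the decaying rate, and uses hypothetical function names and parameter values.

```python
import random
import statistics


def simulate_simon_yule(n_articles, alpha, rng):
    """Simulate one realisation; returns journal productivities, largest first.

    Each new article starts a new journal with probability alpha; otherwise
    it joins an existing journal chosen in proportion to that journal's
    current article count.
    """
    owners = [0]  # owners[k] is the journal index of article k
    counts = [1]  # counts[j] is the number of articles in journal j
    for _ in range(n_articles - 1):
        if rng.random() < alpha:
            counts.append(1)
            owners.append(len(counts) - 1)
        else:
            # Picking a uniformly random past article selects its journal
            # with probability proportional to the journal's size.
            j = owners[rng.randrange(len(owners))]
            counts[j] += 1
            owners.append(j)
    return sorted(counts, reverse=True)


def bradford_curve(productivities):
    """Cumulative article number as a function of journal rank."""
    total = 0
    curve = []
    for x in productivities:
        total += x
        curve.append(total)
    return curve


def rank_spread(n_runs, n_articles, alpha, seed=0, top=3):
    """Mean and sample standard deviation of productivity at the top ranks."""
    rng = random.Random(seed)
    runs = [simulate_simon_yule(n_articles, alpha, rng)[:top]
            for _ in range(n_runs)]
    return [(statistics.mean(col), statistics.stdev(col)) for col in zip(*runs)]
```

Calling `rank_spread` with a few dozen runs gives, for each of the leading ranks, a mean productivity and its standard deviation across realisations; comparing the two indicates how far any single observed Bradford curve may sit from the analytical expectation.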


Another issue with our method is that the first derivatives of the core region (Equation (14)) 
and the normal region (Equation (16)) differ at the point $(T_0, A_0)$. As a result, the analytical 
curve has a kink at the intersection point, whereas the numerical simulation results are 
smooth throughout. One potential solution is to propose a more complex 
formula for the normal region, but this would inevitably complicate the entire method. Given the 
difficulty of precisely predicting Bradford’s curve dynamics, this aspect is not explored further in 
our research. 


5 Conclusion 


In this paper, we delve into how integer constraints, specifically the limits imposed by the 
number of journals T and articles A, affect the shape of Bradford’s curve. We categorize 
Bradford’s curve into two zones: the core zone and the normal zone, based on whether the integer 
effect plays a significant role. Utilizing the Simon-Yule model, we derive analytical results for key 
parameters and distributions under constant entry rate conditions. Then, we formulate theoretical 
equations for each zone and analyze the reasons behind the diverse shapes of Bradford’s curves. 
Monte Carlo simulations help us explore how decreasing entry rates of new sources and decay rates 
impact the curve's shape and key parameters. Finally, we validate our approach using empirical data 
from the Croatian Chemistry and Solar Power datasets, demonstrating its ability to predict Bradford’s 
curve dynamics. From our findings, we draw several conclusions: 


1. Bradford’s curve should be divided into distinct zones based on the significance of integer 
constraints, each requiring separate formulae. 


2. The shape of Bradford’s curve can take on four different forms, determined by the second 
derivatives of the core and normal zones. 


3. Key parameters such as the maximum productivity $X_1$, journal number $T_0$, and article number 
$A_0$ play crucial roles in shaping Bradford’s curves, with entry rate and decay rate changes 
influencing these parameters. 


4. Despite some errors, our proposed four-step method can effectively predict general trends in 
Bradford’s curves. 


These insights can guide academic libraries in procuring and utilizing scientific literature 
effectively. 